Abstract. Over the lifetime of the IUE project, a large
archive of spectral data has been generated, and automated
classification methods are required to analyze it in an
objective way. Previous automated classification methods
used with IUE spectra were based on multivariate statistics.
In this paper, we compare two classification methods
that can be applied directly to spectra in the archive:
metric distance and artificial neural networks. These methods
are used to classify IUE low-dispersion spectra of normal
stars with spectral types ranging from O3 to G5. The
classification based on artificial neural networks performs
better than the metric distance, allowing the determination
of the spectral classes with an accuracy of 1.1 spectral
subclasses.

Key words: methods: data analysis - techniques: spectroscopic -
stars: fundamental parameters



1. Introduction 

The availability of large spectral archives and the efficiency
achieved with modern instrumentation call for automated
classification methods to improve on classification
by visual inspection. These methods are still in an
exploratory phase, but the aims are clear: to devise an objective,
repeatable and robust classification scheme that provides an
estimate of systematic and random errors, and that allows
the effect of spectral resolution and signal-to-noise
ratio on classification errors to be quantified.

Automated classifiers of stellar spectra can be divided
into metric distance algorithms, multivariate statistics and
artificial neural networks (hereafter ANN). Metric distance
methods were originally proposed by Kurtz and
LaSala (Kurtz 1982, 1984, LaSala 1994) and have been
used by Penprase (1994) to classify stellar spectra using
the digital spectral atlas of Jacoby et al. (1984) as
template. Multivariate statistical methods are linear
algorithms used for exploratory data analysis. These methods
have been applied to spectral classification by using
Principal Component Analysis (PCA) to reduce the dimension
of the problem, followed by Cluster Analysis (CA) to
discover groups of objects in the parameter space obtained
in the previous step (Murtagh & Heck 1984 and references
therein). Stellar classification with ANN is a new approach
that has been used by von Hippel et al. (1994a, 1994b)
to confirm the visual classification of the Michigan
Spectral Catalogue on objective prism spectra, determining the
temperature classification to better than 1.7 spectral
subclasses from B3 to M4. Gulati et al. (1994) classify the
spectral atlas of Jacoby et al. (1984) with an accuracy of
2 spectral subclasses, based on selected spectral features.

The availability of the IUE low-dispersion archive
(Wamsteker et al. 1989) allows the application of pattern
recognition methods to explore the ultraviolet domain.
The analysis of this archive is especially interesting, due
to the homogeneity of the sample. As indicated by Heck
(1987), it is important to remember at this point that
MK spectral classifications defined from the visible range
cannot simply be extrapolated to the ultraviolet spectral
range. So far, only multivariate statistical methods have
been used with IUE spectra. Egret and Heck (1983)
analyze the relative fluxes at 16 selected wavelengths of O
and B stars with PCA. Egret et al. (1984) analyze
low-resolution spectra using PCA on 93 variables computed
as median flux values at certain wavelength bands and
selected absorption and emission lines. These analyses
indicate a high correlation between the first principal
component and the temperature. Heck et al. (1986) classify
the IUE Low-Dispersion Spectra Reference Atlas (Heck
et al. 1984) of normal stars. Weighted intensities of sixty
lines, together with an asymmetry coefficient describing
the continuum shape, are used for the classification. The
algorithm consisted of PCA followed by CA to define
different groups that confirmed the manual classification of the
Atlas. Imadache and Creze (1990) and Imadache (1992)
extend the sample and generalize the method, using the
full spectral range instead of pre-selected spectral features.

The present work has been done within the context
of the IUE Final Archive project. The aim is to provide
an efficient and objective classification procedure to
explore the complete IUE database, based on methods that
do not require prior knowledge about the object to be
classified. Two methods are compared: a simple metric
distance method, and a supervised ANN classifier. The
input sample and data preparation steps are described
in Sect. 2. The classification method based on metric
distance is summarized in Sect. 3. Section 4 explains the
classification method using ANN. The results obtained with
both methods are described in Sect. 5, and the conclusions
are presented in Sect. 6.

2. The sample spectra 

The spectra were taken from the IUE Low-Dispersion
Reference Atlas of Normal Stars (Heck et al. 1984, hereafter
the Atlas), covering the wavelength range from 1150 to
3200 Å. The Atlas contains 229 normal stars distributed
from spectral type O3 to K0. The classification given
in the Atlas was carried out following a classical
morphological approach (Jaschek & Jaschek 1984), based on UV
criteria alone. The set of 64 standard stars selected in the
Atlas, with spectral types from O3 to G5, was used as
the template set in the metric distance classification and as
the training sample in the ANN classification. The test set
contained 163 spectra, excluding the 64 standard stars
and two stars with spectral types G8 and K0, which lie outside
the spectral types covered by the training set.

The spectra were obtained by merging data
from the two IUE cameras, sampled at a uniform
wavelength step of 2 Å, after processing with the standard
calibration pipeline. Although the spectra are of good
quality, two effects seriously hinder
automated classification: interstellar extinction and
contamination by geo-coronal Ly-α emission. Some
pre-processing was required to eliminate these effects and to
normalize the data.

All spectra were corrected for interstellar extinction by
using Seaton's (1979) extinction law. Due to the properties
of the extinction law at $\lambda_1 = 1600$, $\lambda_2 = 2400$ and
$\lambda_c = 2175$ Å, the color excess $E(B-V)$ was estimated as

$$ E(B-V) \simeq 1.368 \, \log \frac{225 f_1^{o} + 575 f_2^{o}}{800 f_c^{o}} \tag{1} $$

The observed fluxes $f_1^{o}$, $f_2^{o}$ and $f_c^{o}$ were obtained by
filtering the high frequency components in the transformed
Fourier space (LaSala and Kurtz 1985). Figure 1 shows a
typical O4 spectrum before and after correction for
reddening.
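
One way to read Eq. (1): the coefficients 225/800 and 575/800 are the weights of a linear interpolation between 1600 and 2400 Å evaluated at 2175 Å, so the ratio compares the interpolated continuum with the flux observed at the centre of the extinction bump. A minimal Python sketch under that reading, assuming a base-10 logarithm and fluxes that have already been smoothed; the function and argument names are illustrative and not taken from the paper:

```python
import math

# Wavelengths (in Angstrom) of the two continuum points and the bump centre.
LAMBDA_1, LAMBDA_2, LAMBDA_C = 1600.0, 2400.0, 2175.0


def color_excess(f1, f2, fc, scale=1.368):
    """Estimate E(B-V) from smoothed observed fluxes at 1600, 2400 and 2175 A."""
    span = LAMBDA_2 - LAMBDA_1            # 800 A
    w1 = (LAMBDA_2 - LAMBDA_C) / span     # 225/800, weight of the 1600 A flux
    w2 = (LAMBDA_C - LAMBDA_1) / span     # 575/800, weight of the 2400 A flux
    interpolated = w1 * f1 + w2 * f2      # continuum expected at 2175 A
    # Base-10 logarithm is assumed here.
    return scale * math.log10(interpolated / fc)
```

With no dip at 2175 Å the estimate is zero; a flux ten times below the interpolated continuum gives the scale factor itself, 1.368.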



Fig. 1. Original (top) and de-reddened (bottom) spectra
corresponding to an O4 star. The selected range is indicated by the
solid lines in the middle



The region below 1250 Å was excluded from the analysis
to eliminate the geo-coronal Ly-α component, as was
the spectral band from 1950 to 2350 Å, because of the
low signal-to-noise ratio in this region. The selected
wavelength range is indicated by the solid lines in Fig. 1. The
resulting spectra contained N = 744 flux values, which
were normalized to obtain a mean flux value of zero and a
sum of the absolute values of the normalized fluxes equal
to one.
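
The range selection and normalization just described can be sketched as follows. This is an illustrative Python version, not the authors' code; whether the band edges themselves are kept or dropped is an assumption made here.

```python
def select_range(wavelengths, fluxes):
    """Keep samples at or above 1250 A and outside the 1950-2350 A band.

    Treating the band edges as excluded is an assumption of this sketch.
    """
    return [f for w, f in zip(wavelengths, fluxes)
            if w >= 1250.0 and not (1950.0 <= w <= 2350.0)]


def normalize(fluxes):
    """Shift to zero mean, then scale so the absolute values sum to one."""
    mean = sum(fluxes) / len(fluxes)
    centered = [f - mean for f in fluxes]
    total = sum(abs(c) for c in centered)
    if total == 0.0:
        raise ValueError("a constant spectrum cannot be normalized")
    return [c / total for c in centered]
```

For example, `normalize([1, 2, 3, 6])` subtracts the mean 3 and divides by 6, so the result sums to zero and its absolute values sum to one.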



3. Classification using metric distance 



Normalized spectra are considered vectors in $\mathbb{R}^N$ and a
metric is introduced in the vector space. In this
classification scheme the metric distance between the object
spectrum and each spectrum in the training set is computed,
and the spectral class of the star in the training set having
the minimum distance is assigned to the object.

Let $f_{ij} = f_i(\lambda_j)$, $j = 1, \ldots, N$, be the flux of the $i$-th
star in the catalogue and $s_{kj} = s_k(\lambda_j)$, $j = 1, \ldots, N$, be
the flux of the $k$-th standard star in the training set, after
correction for reddening and normalization. The distance,
$d_{ik}$, is defined by



$$ d_{ik} = \frac{1}{N} \sum_{j=1}^{N} (f_{ij} - s_{kj})^2 \tag{2} $$
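
The minimum-distance assignment can be sketched in a few lines of Python. The sketch assumes a mean squared flux difference as the metric; another metric would slot into the same `distance` function. The names `distance`, `classify_by_distance` and `templates` are illustrative.

```python
def distance(f, s):
    """Mean squared difference between two normalized spectra (assumed metric)."""
    if len(f) != len(s):
        raise ValueError("spectra must have the same number of flux values")
    return sum((a - b) ** 2 for a, b in zip(f, s)) / len(f)


def classify_by_distance(spectrum, templates):
    """Return (label, distance) of the closest standard star.

    `templates` is a list of (label, flux_vector) pairs.
    """
    label, flux = min(templates, key=lambda t: distance(spectrum, t[1]))
    return label, distance(spectrum, flux)
```

Note that the method returns a single label per star; the distance value alone does not say how much better the best template is than the runner-up.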



4. Classification using ANN 



A supervised classification scheme based on artificial
neural networks (ANN) has been used. This technique was
originally developed by McCulloch and Pitts (1943) and
has been generalized with an algorithm for training
networks having multiple layers, known as back-propagation
(Rumelhart et al. 1986). In general form, a neural
network represents a function, $F$, that maps a given input
set into a selected output set. Assuming that normalized
spectra are elements of an $N$-dimensional vector space
and spectral classes are defined by $M$-dimensional
classification vectors, the network approximates the mapping
$F: \mathbb{R}^N \to \mathbb{R}^M$ in such a way that a standard star,
$s_k$, is associated with the vector $c_k = F(s_k)$. The output
vector defines the spectral class, so that O0 is given by
(1, 0, 0, ..., 0), O1 is represented by (0, 1, 0, ..., 0) and so
on. In this form, the network can be regarded as a
classifier that maps the input space of normalized fluxes into
1 of $M$ classes, i.e., one output at unity and all others at zero.
Such a network, with a squared-error cost function, gives a good
estimate of Bayesian probabilities (Richard & Lippmann
1991), so that for an input spectrum $f$, the $i$-th
component of the output vector is the probability $P(C_i | f)$ for
class $i$ given the input spectrum.
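
The 1-of-M coding can be illustrated with a short Python sketch. It assumes ten subclasses per letter along the sequence O, B, A, F, G, K, so that O0 maps to index 0 and K0 to index 50, which is consistent with the M = 51 output nodes used in this study; that mapping is an inference, not something the text states explicitly.

```python
LETTERS = "OBAFGK"   # assumed temperature sequence, ten subclasses per letter
M = 51               # number of output nodes used in the study


def class_index(spectral_type):
    """Map a type such as 'B3' to a subclass index, with 'O0' -> 0."""
    letter, digit = spectral_type[0], int(spectral_type[1:])
    if letter not in LETTERS or not 0 <= digit <= 9:
        raise ValueError(f"unsupported spectral type: {spectral_type}")
    index = 10 * LETTERS.index(letter) + digit
    if index >= M:
        raise ValueError(f"{spectral_type} is outside the {M} classes")
    return index


def one_of_m(spectral_type):
    """Classification vector: 1 for the star's class, 0 elsewhere."""
    vec = [0.0] * M
    vec[class_index(spectral_type)] = 1.0
    return vec
```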

The supervised classifier works in two phases. During
the first, the learning phase, the classifier is trained
with the standard stars in the Atlas together with the
associated output vectors defining the spectral classes. In the
second, the test sample is presented to the network
and spectral types are assigned as defined by the maximum
value of the probability distributions estimated by
the classifier.

The network architecture consists of an input layer,
one or more hidden layers and the output layer. The input
layer contains $N$ nodes that accept the individual
components of the input vector and distribute them to the nodes
in the second layer. Nodes in a layer receive the weighted
sum of the output from all the nodes in the previous layer,
so that the input of node $j$ is

$$ x_j = \sum_i w_{ji} \, y_i \tag{3} $$

where $y_i$ is the output of node $i$ in the previous layer and
$w_{ji}$ is the weight associated with the connection of node $i$
to node $j$. The output of node $j$ is computed using the
sigmoid transfer function

$$ y_j = \frac{1}{1 + e^{-x_j}} \tag{4} $$
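
The feed-forward step of Eqs. (3) and (4) can be sketched in plain Python as below. The sketch follows the text in using a weighted sum without a bias term; the function names and the list-of-matrices layout are illustrative choices.

```python
import math


def sigmoid(x):
    """Sigmoid transfer function of Eq. (4)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(weights, inputs):
    """Outputs of one layer; weights[j][i] connects node i to node j (Eq. 3)."""
    return [sigmoid(sum(w * y for w, y in zip(row, inputs)))
            for row in weights]


def feed_forward(layers, spectrum):
    """Propagate an input vector through a list of weight matrices."""
    y = list(spectrum)
    for weights in layers:
        y = layer_output(weights, y)
    return y
```

With all weights at zero every node outputs 0.5, which is a convenient sanity check before training.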



During the training phase, input vectors with the
normalized fluxes of the standard stars are presented to the
network and Eqs. (3) and (4) are applied in feed-forward
mode until the output layer with $M$ nodes is reached.
For each star in the training set, the output vector $o$ is
compared with the desired classification vector $c$, defining
the spectral type of the standard star, and the error is
evaluated as



$$ E = \frac{1}{2} \sum_{i=1}^{M} (o_i - c_i)^2 \tag{5} $$



The minimization of the error is achieved during the
training phase by changing the connection weights according
to the error feedback mechanism known as
back-propagation (Rumelhart et al. 1986). During this step, the
connection weights $w_{ji}$ are updated using the rule

$$ \Delta w_{ji}(t) = -\eta \, \frac{\partial E}{\partial w_{ji}(t)} + \alpha \, \Delta w_{ji}(t-1) \tag{6} $$

where $\eta$ is the learning rate, $\alpha$ is the momentum factor
that reduces oscillations during the learning process, $t$ is the
iteration number and $E$ is the error, given by Eq. (5).
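
The update rule of Eq. (6) is a gradient step plus a fraction of the previous step. A minimal Python sketch on a flat list of weights, with the gradients supplied by the caller; the default values 0.1 and 0.5 are the operational values quoted in this paper, and the function name is illustrative:

```python
def momentum_step(weights, gradients, previous_deltas, eta=0.1, alpha=0.5):
    """Apply Eq. (6) once.

    Returns the updated weights and the deltas to reuse at the next iteration.
    """
    deltas = [-eta * g + alpha * d
              for g, d in zip(gradients, previous_deltas)]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas
```

On the toy error E = w²/2, whose gradient is w, starting from w = 1 with no previous step, the first update gives w = 0.9 and the second gives w = 0.76: the momentum term adds half of the previous step to the new gradient step.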

The procedure was repeated for all spectra in the
training set over several iterations. The training set was sampled
randomly before each iteration, to avoid trends during
the learning phase. After each iteration, the average error
for all the stars in the training set was computed using
Eq. (5), to monitor the convergence of the procedure.
After 2000 iterations there was no substantial improvement
in the classification. To prevent excessive corrections
during the first iterations, the parameters $\eta$ and $\alpha$ were
linearly increased with the iteration number, reaching the
operational values of 0.1 and 0.5, respectively, after 100
iterations.
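
The linear ramp of the two training parameters might look like the following Python sketch. The text gives only the end values and the ramp length, so starting the ramp from zero is an assumption, and the function names are illustrative.

```python
def ramped(target, iteration, ramp_iterations=100):
    """Linearly increase a parameter towards `target`, then hold it.

    The ramp is assumed to start from zero at iteration 0.
    """
    return target * min(iteration, ramp_iterations) / ramp_iterations


def schedule(iteration):
    """Learning rate and momentum factor for a given iteration number."""
    return ramped(0.1, iteration), ramped(0.5, iteration)
```

Under this assumption, iteration 50 uses half the operational values, and every iteration from 100 onwards uses 0.1 and 0.5.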

In our study we have used several network configurations,
with an input layer consisting of $N = 744$ nodes,
corresponding to the number of spectral flux values, one
or two hidden layers and an output layer having $M = 51$
output nodes. A total of six different network topologies
were used in the study, with 30, 60 and 120 nodes in the
hidden layers. In addition, different output distributions
were used during the training phase, by convolving the
discrete $\delta$ functions representing the class probabilities
with Gaussian distributions having different standard
deviations. The best results were obtained for a standard
deviation of 0.7. Table 1 summarizes the classification
statistics for each configuration: the first two columns define
the network topology, the third column shows the
correlation coefficient, $r$, between the classification obtained
with ANN and the manual classification given in the Atlas,
and the fourth column is the standard deviation, $\sigma$, of the
differences between the two classifications. The network
topology 744 × 120 × 120 × 51 produces the best
classification. This network was used in the analysis below. Figure 2
shows the total error, given by Eq. (5), as a function of the
number of iterations for the 744 × 120 × 120 × 51 topology.
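
Convolving a discrete δ function with a Gaussian amounts to replacing the single unit entry of the target vector by a Gaussian profile centred on the true class. A minimal Python sketch, with the standard deviation expressed in subclasses; renormalizing the result to unit sum is an assumption of the sketch, and the function name is illustrative:

```python
import math


def smoothed_target(class_idx, n_classes=51, sigma=0.7):
    """Gaussian-broadened target vector centred on `class_idx`.

    The vector is rescaled to unit sum so it can still be read as a
    probability distribution (an assumption of this sketch).
    """
    raw = [math.exp(-0.5 * ((i - class_idx) / sigma) ** 2)
           for i in range(n_classes)]
    total = sum(raw)
    return [r / total for r in raw]
```

With a standard deviation of 0.7, most of the weight stays on the true class, with a smaller share on each immediate neighbour and very little beyond.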

Table 1. Network Configurations

Hidden   Hidden
Nodes    Layers   $r$     $\sigma$
120      2        0.988   1.107
60       2        0.983   1.350
30       2        0.946   2.379
120      1        0.984   1.313
60       1        0.986   1.221
30       1        0.982   1.378

Fig. 2. Average error as a function of the iteration number for
the ANN classifier with 744 × 120 × 120 × 51 topology



5. Results

Both methods were applied to the test set of 163 stars
in the Atlas, excluding the 64 standard stars. In the metric
distance method, the class of a test star was determined
by the minimum distance given by Eq. (2); there is no way
to assess the quality of the classification for a given star.
In the classification using ANN, the fluxes of a test star are
mapped into the classification space, producing an output
vector that estimates the Bayesian probabilities for each
class. The spectral type is defined by the maximum value
of this vector. In addition, the probability distribution
provides information on the quality of the classification for
each test star.
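
The assignment rule for the ANN output, together with a simple use of the peak value as a quality indicator, can be sketched as follows. The 0.5 threshold and the function names are hypothetical choices for illustration; the paper does not specify a threshold.

```python
def assign_class(output):
    """Return (index of the largest output, value of that output)."""
    best = max(range(len(output)), key=lambda i: output[i])
    return best, output[best]


def is_uncertain(output, threshold=0.5):
    """Flag a star whose peak output falls below a chosen threshold.

    The threshold value is a hypothetical example.
    """
    return assign_class(output)[1] < threshold
```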

Figures 3a and 3b display the correlation between the
manual classification given in the Atlas, on the horizontal axes,
and the classifications obtained with the metric distance
and ANN, on the vertical axes, respectively. Fig. 3c shows the
correlation between metric distance and ANN to
demonstrate the consistency of the two classification methods. A
line of slope unity is also plotted. These figures show
good agreement between the automated methods and the manual
classification, confirming the classification in the Atlas.
However, the results obtained with ANN are better than
the classification with metric distance.

Fig. 3. Results of classification: Metric Distance versus Atlas
(top), ANN versus Atlas (middle) and ANN versus Metric
Distance (bottom)

Table 2. Comparative Performance

          Metric Distance   ANN           ANN vs.
Param.    vs. Catalog       vs. Catalog   Metric Distance
$\sigma$  1.375             1.107         1.144
$r$       0.982             0.988         0.988

Correlation analysis was used to evaluate the
performance of the two classification methods. Table 2
summarizes the results, including the standard deviation, $\sigma$, and
the correlation coefficient, $r$. The table shows the superior
accuracy of the classification obtained with ANN, $\sigma = 1.1$,
over the metric distance, $\sigma = 1.4$. Further analysis of
the distribution of the classification errors indicates that
46.6 % of the stars were correctly classified by ANN, i.e., in
agreement with the spectral class in the Atlas, while only
35.6 % were correctly assigned using metric distance. Only
six stars were classified by ANN with a discrepancy larger
than 2 spectral subclasses. Detailed analysis of these
objects shows a remarkable agreement between both methods.
In addition, the classification vectors generated by ANN
for these stars have well defined maxima. Figure 4 shows
the histogram of the deviations between the classifications
produced by ANN and metric distance and the
classification given in the Atlas. The superior performance of ANN
is evident in these distributions.
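
The three figures of merit quoted above (standard deviation of the differences, correlation coefficient, and fraction of exact matches) can be computed from two lists of subclass indices, as in the Python sketch below. It assumes a population standard deviation (division by the number of stars) and the Pearson correlation coefficient; the paper does not state which estimators were used, and the function name is illustrative.

```python
import math


def comparison_stats(reference, predicted):
    """Return (sigma, r, fraction_exact) for two lists of subclass indices."""
    n = len(reference)
    diffs = [p - a for a, p in zip(reference, predicted)]
    mean_diff = sum(diffs) / n
    # Population standard deviation of the differences (assumed estimator).
    sigma = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / n)
    mean_a, mean_p = sum(reference) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p)
              for a, p in zip(reference, predicted))
    var_a = sum((a - mean_a) ** 2 for a in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)   # Pearson correlation coefficient
    exact = sum(1 for d in diffs if d == 0) / n
    return sigma, r, exact
```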



Fig. 4. Distribution of classification discrepancies for ANN
and Metric Distance versus Atlas



To verify that the ANN classifier gives a good estimate
of Bayesian a posteriori probabilities, we have
analyzed the output distributions obtained for the complete
Atlas. As a first test, the sum of the output values
should be 1 for each sample star. The estimated mean
value for the total set ($K = 227$) is

$$ \frac{1}{K} \sum_{k=1}^{K} \sum_{i=1}^{M} o_{i,k} = 0.96 \pm 0.06. \tag{7} $$
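
The first test is a mean and scatter of the per-star output sums. A minimal Python sketch, assuming the quoted uncertainty is the standard deviation of those sums (the text does not say how it was derived); the function name is illustrative:

```python
import math


def output_sum_stats(outputs):
    """Mean and standard deviation of the per-star sums of output values.

    `outputs` holds one output vector (length M) per star.
    """
    sums = [sum(vec) for vec in outputs]
    mean = sum(sums) / len(sums)
    std = math.sqrt(sum((s - mean) ** 2 for s in sums) / len(sums))
    return mean, std
```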



As a second test, the output distribution averaged over
the test sample should be equal to the a priori probability
distribution in the Atlas, $P(C_i)$. Figure 5 compares the
probability distribution in the Atlas with the averaged
output distribution assigned by ANN, estimated as



$$ \hat{P}(C_i) = \frac{1}{K} \sum_{k=1}^{K} P(C_i | f_k) \simeq \frac{1}{K} \sum_{k=1}^{K} o_{i,k} \tag{8} $$



There is a remarkable agreement between both
distributions. The probability values obtained from the Atlas and
estimated with ANN are summarized in Table 3 for several
ranges of spectral types, confirming that ANN gives an
accurate estimate of the Bayesian probabilities. The bias
towards hot spectral types, intrinsic to the IUE archive,
is also evident in this sample.
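
The second test averages each output component over all stars and then sums the result over ranges of spectral types for comparison with the class frequencies in the Atlas. An illustrative Python sketch, assuming bins of five consecutive subclasses as in Table 3; the function names are not from the paper:

```python
def averaged_distribution(outputs):
    """Average each output component over all stars."""
    k = len(outputs)
    m = len(outputs[0])
    return [sum(vec[i] for vec in outputs) / k for i in range(m)]


def bin_probabilities(distribution, width=5):
    """Sum a per-subclass distribution over consecutive bins of `width` classes."""
    return [sum(distribution[i:i + width])
            for i in range(0, len(distribution), width)]
```

Applying the same two functions to the 1-of-M classification vectors of the Atlas gives the a priori distribution to compare against.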



Table 3. Probability distributions

Class      Atlas   ANN
O0 - O4    0.048   0.056
O5 - O9    0.141   0.154
B0 - B4    0.322   0.291
B5 - B9    0.185   0.190
A0 - A4    0.132   0.141
A5 - A9    0.066   0.073
F0 - F4    0.075   0.063
F5 - F9    0.022   0.008
G0 - G4    0.004   0.007

Fig. 5. Probability distributions of the spectral classes in the
Atlas and estimated by ANN


6. Conclusions

The relative performance of two automated classification
methods has been analyzed using IUE low-dispersion
spectra of normal stars. The methods do not assume prior
knowledge about the spectral types to be classified, and
the algorithms can be applied directly to the observed
flux distributions. The analysis confirms the qualitative
results obtained on the same sample using multivariate
statistics on selected spectral features.

The simple metric distance gives a good classification,
but the results produced with the ANN classifier are
better. The accuracy obtained with this method is 1.1
spectral subclasses. ANN classifiers have several advantages:
they estimate Bayesian probabilities for each spectral type
and, in addition, make it possible to identify spectra with
uncertain classifications. The method is robust enough to
be used with spectra that have missing information, and
further improvements can be obtained with IUE spectra
by rejecting bad pixels as indicated in the quality flags
associated with the spectral values.

This research will be continued to derive physical
parameters by using stellar models in connection with
observed spectra. Unsupervised ANN classification will be
used on the same set.